A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics

Authors

  • Li Deng
  • Jeff Z. Ma
Abstract

A statistical coarticulatory model is presented for spontaneous speech recognition, where knowledge of the dynamic, target-directed behavior in the vocal tract resonance responsible for the production of highly coarticulated speech is incorporated into recognizer design, training, and likelihood computation. The principal advantage of the new speech model over the conventional HMM is the use of a compact, internal structure that parsimoniously represents long-span context dependence in the observable domain of speech acoustics without using additional, context-dependent model parameters. The new model is formulated mathematically as a constrained, nonstationary, and nonlinear dynamic system, for which a version of the generalized EM algorithm is developed and implemented for automatically learning the compact set of model parameters. Experiments for speech recognition using spontaneous speech data from the SWITCHBOARD corpus are reported.
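The abstract does not give the state equation, so the following is only a minimal sketch of what "target-directed" hidden dynamics could look like. It assumes a noise-free, scalar, first-order recursion z[k+1] = r * z[k] + (1 - r) * target. The function name, the rate `r`, the targets, and the durations are all hypothetical values chosen for illustration, not parameters from the paper. The observation mapping and the generalized EM training are not shown.

```python
def target_directed_trajectory(z0, targets, durations, rate):
    """Hypothetical noise-free recursion: z[k+1] = rate*z[k] + (1-rate)*target.

    Each segment pulls the hidden state toward its own target; the state is
    continuous across segment boundaries, so earlier segments influence
    later ones without any extra context-dependent parameters.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must lie in [0, 1)")
    z = z0
    trajectory = []
    for target, n_frames in zip(targets, durations):
        for _ in range(n_frames):
            z = rate * z + (1.0 - rate) * target
            trajectory.append(z)
    return trajectory


# Illustrative (made-up) resonance targets in Hz for two short segments.
traj = target_directed_trajectory(
    z0=500.0, targets=[700.0, 300.0], durations=[5, 5], rate=0.8
)
```

With only five frames per segment, the state approaches each target without reaching it, and the second segment starts from wherever the first one ended. This is one simple way to picture how a continuous hidden trajectory can produce context dependence in the acoustics.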

Similar resources

Spontaneous speech recognition using a statistical coarticulatory model for the vocal-tract-resonance dynamics.

A statistical coarticulatory model is presented for spontaneous speech recognition, where knowledge of the dynamic, target-directed behavior in the vocal tract resonance is incorporated into the model design, training, and in likelihood computation. The principal advantage of the new model over the conventional HMM is the use of a compact, internal structure that parsimoniously represents long-...

Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - model and training

We propose and evaluate a new acoustic model that combines HMM and a special type of the hidden dynamic model (HDM) – a target-directed hidden trajectory model – into a single integrated model named HTHMM. The new model provides a computational model of coarticulation by representing the internal dynamics of human speech based on the hidden trajectory of the vocal-tract resonances. This paper f...

A Generative Modeling Framework for Structured Hidden Speech Dynamics

We outline a structured speech model, as a special and perhaps extreme form of probabilistic generative modeling. The model is equipped with long-contextual-span capabilities that are missing in the HMM approach. Compact (and physically meaningful) parameterization of the model is made possible by the continuity constraint in the hidden vocal tract resonance (VTR) domain. The target-directed VT...

Statistical multi-stream modeling of real-time MRI articulatory speech data

This paper investigates different statistical modeling frameworks for articulatory speech data obtained using real-time (RT) magnetic resonance imaging (MRI). To quantitatively capture the spatio-temporal shaping process of the human vocal tract during speech production, a multi-dimensional stream of direct image features is extracted automatically from the MRI recordings. The features are close...

Faster 3d vocal tract real-time MRI using constrained reconstruction

Real-time magnetic resonance imaging (rtMRI) is a valuable emerging tool for studying the dynamics of vocal production. Conventional 2D rtMRI typically images the midsagittal plane of the vocal tract, acquiring data from all the important articulators. Dynamic 3D MRI would be a major advance, as it would provide 3D visualization of the vocal tract shaping dynamics, especially for the modeling o...

Publication date: 1999